ABSTRACT


This paper provides an overview of the 4 major aspects of the PIXIE project,
namely: the field work undertaken to determine how teachers diagnose and
remediate in introductory algebra; the set of experiments run to determine
the relative effectiveness of Model-Based-Remediation (MBR) and Reteaching;
systems work carried out to remedy shortcomings noted earlier in the
Intelligent Tutoring System, PIXIE; and an experiment run to determine
whether it is possible to enhance teachers' diagnostic capabilities. (More
detailed discussions of each of these topics are provided in 4 separate
technical reports.)

The  major  conclusions  from  the  four  phases  of  the  work  are: 

Field work: the teachers involved in the study tutored algebra essentially
procedurally.

Relative effectiveness of MBR and Reteaching: for algebra when taught
procedurally with this age group, Reteaching seems as effective as MBR. This,
in turn, implies that CAI is as effective as ICAI. Further, we noted the
importance of treating different types of errors differently; e.g., a
consistent mal-rule should be treated differently to a slip.

System work: the initial basic PIXIE system has now been enhanced so that it
can diagnose and remediate in several domains; use information about the
student's intermediary working to reduce the number of remedial models
presented to a student; and create a more global analysis of a student's
performance.

Teachers as diagnosticians: this experiment concluded that exposure to the
TPIXIE program did enhance the trainee teachers' ability to diagnose student
errors.

The paper ends with an extensive set of conclusions and suggestions for
further work.


1. INTRODUCTION


Despite the considerable advances which have taken place in cognitive
psychology, and in particular in information processing psychology, in
the last two decades, the field does not have a prescriptive theory of
instruction. Consequently, cognitive and instructional psychology are
essentially still empirical sciences, although they have a growing
corpus of knowledge to guide decisions. Several cognitive psychologists
now view the field of intelligent tutoring systems (ITSs) as offering an
important test bed for psychological theories (Anderson, et al., 1984);
certainly these systems have the important characteristic of producing a
reproducible environment. The lack of an overall theory has led this
research group to be particularly rigorous with field testing of its
systems. This, as we shall see, has been a sobering exercise for the
team, but, we hope, a valuable one for the field as a whole!

Given an accurate model of a student's performance in a domain (algebra),
the focus of this project has been: how does one build an effective
remedial system? The overall design assumed that remediation would
be based on information in the student model, and that such a remedial
system would be highly effective. It was then proposed to further
fine-tune this remediation to tailor it to students' individual aptitudes
and learning styles. Indeed, we hoped to implement a truly adaptive
intelligent tutoring system, namely one that would address the
aptitude-treatment interaction issue (Cronbach & Snow, 1977). It was
tacitly assumed that:

MODEL-BASED-REMEDIATION  would  be  superior  to  RETEACHING. 

In the early 1980's, due to the influence of the BUGGY work (Brown &
Burton, 1978) and the carry-over of the programming debugging analogy, it
was generally accepted that:


diagnosing a student's error was much more complex than (subsequent)
remediation, i.e., remediation followed trivially once one
had an accurate student model.


highlighting  a  student's  specific  error(s)  would  create  cognitive 
dissonance  which  would  then  make  the  student  receptive  to  hearing 
the  "truth". 


by  and  large  it  was  expected  that  many  student  errors  would  be 
stable,  i.e.,  students  would  have  (reasonably)  stable  models  of  the 
task  domain. 


Brown and VanLehn (1980) suggest that the metaphor of the computer bug
may have been misleading, and that bug migration is a phenomenon which
the field needs to take seriously. Sleeman (1983) noted that there were
different types of errors present in a population of algebra students,
and that many students seem to follow a pattern of maturation during
their understanding of a topic:


UNPREDICTABLE  ->  CONSISTENT  USE  of  MAL-RULES  ->  CORRECT 


This  project  has  produced  experimental  evidence  which  challenges  the 
assumptions  listed  above,  and  which  supports  the  idea  that  students' 
errors  vary  over  time  and  in  duration. 


Section 2 describes the studies undertaken to determine how teachers
diagnose and remediate student errors in algebra; this section also
includes a brief description of the remedial sub-system that was
subsequently implemented. Section 3 describes a series of experiments
undertaken to probe the effectiveness of the remedial sub-system;
specifically, we investigated its effectiveness against simply reteaching.
Section 4 describes some modifications carried out to the PIXIE system
to make it a more effective tutor. Section 5 describes experiments that
measure attempts to enhance teachers' diagnostic capabilities. Section
6 reports the overall conclusions of the research, and section 7 sets
out an ambitious program of work which follows from this study and its
conclusions.

2. FIELD STUDIES OF TEACHERS CARRYING OUT DIAGNOSIS & REMEDIATION


In order to begin to identify what makes for effective diagnosis and
remediation of linear algebraic equations, and how this relates to the
design of intelligent tutoring systems, two substantial and two supportive
studies of master teachers were undertaken. In the first study, 4
experienced teachers were shown a series of task-answer pairs which had
been incorrectly worked by pupils, and asked to suggest a diagnosis and a
suitable remediation. Although there was often a common error in each
of the several sets of tasks presented, this was not pointed out to the
teachers. Only one of the four teachers looked for a common error; the
others were happy to make suggestions on a task-by-task basis. The
teachers suggested remediation for approximately 50% of the errors, it
being notable that when multiple errors occurred the teachers only
suggested remediation for one of them (the most important error?).
Further, procedural forms of remediation were suggested more than twice
as frequently as conceptually-based forms of remediation. For further
details of this study see Kelly & Sleeman (1986).

In the second study an experienced maths teacher was observed tutoring
eight students, based on the diagnosis provided for each student by the
PIXIE system. This teacher's remediation was also essentially procedural
but it did have two striking and unexpected features. Firstly,
this teacher, having been told that the student was doing flipped
division (i.e., transforming tasks of the form 5x=3 to x=5/3), would probe
this diagnosis by means of a series of simpler equations to determine
the reason for this. For instance, did the student know how to write 5
divided by 3? Did he know how to cope with improper fractions? Or was
he simply lacking a general procedure to solve tasks of this form?
Having carried out this further probing and diagnosis, the teacher would
then proceed to give the student procedurally-based remediation.
Because of the way in which these diagnoses had been confirmed, we refer
to this as causal-based-remediation. Secondly, the teacher presented
his remediation in a very tentative way, taking great care to point out
to the student the steps he had done correctly, and the reasonableness
of the errors made. This teacher was a model empathetic tutor.
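
The flipped-division mal-rule mentioned above can be illustrated with a
small sketch. This is not PIXIE's representation of rules; the function
names are hypothetical and the task form a*x = b (with non-zero a and b)
is assumed purely for illustration.

```python
from fractions import Fraction


def solve_correct(a: int, b: int) -> Fraction:
    """Correct rule for a*x = b: divide both sides by the coefficient a."""
    return Fraction(b, a)


def solve_flipped_division(a: int, b: int) -> Fraction:
    """Hypothetical encoding of the mal-rule: the division is done the wrong way round."""
    return Fraction(a, b)


def matches_flipped_division(a: int, b: int, student_answer: Fraction) -> bool:
    """True when the answer is what the mal-rule predicts and is not also the correct answer."""
    return (student_answer == solve_flipped_division(a, b)
            and student_answer != solve_correct(a, b))


# Task from the text: 5x = 3, answered as x = 5/3.
print(matches_flipped_division(5, 3, Fraction(5, 3)))
```

A match on one task is only a candidate diagnosis; as the teacher's
probing shows, further tasks are needed to find out why the student
produced that answer.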

The first supporting study was a series of interviews with 3 Irish
mathematics teachers; the interviews covered how they taught and
remediated algebra bugs. All supported the need to teach algebra (and one
suspects most of mathematics) procedurally on the grounds of
effectiveness and time-constraints. Further, these teachers also stressed
the need not "to demolish a student's confidence by pointing out a series
of errors".

For the second supporting study, we held a workshop for algebra teachers
in the San Francisco area to discuss the teaching and remediation of
algebra, and had all our earlier observations about the centrality of
procedural teaching confirmed.

As a result of these studies we concluded that the vast majority of
teachers taught algebra procedurally. It was therefore decided to
implement a remedial system which had that form. (It was decided not to
implement the more complex causal-based-remediation approach.)

The philosophy behind the remediation was that it would be beneficial to
highlight each error, to explain what is wrong, and what should have
been done. Finally, to reinforce these points the task was reworked
correctly, with a commentary on each of the steps. Throughout this
document, this will be referred to as model-based-remediation (MBR); an
alternative name being error-based-remediation (figure 1). Another
commonly used mode of remediation available within RPIXIE is Reteaching,
i.e., the latter half of the protocol in figure 1 (namely, that below
the line).

The figure illustrates interaction with the RPIXIE system during the
series of experiments described in section 3. The system developments
described in section 4 were carried out in parallel with this
experimental work, and therefore had not been completed at the time the
instructional experiments were conducted.


FIGURE 1

An Example of a Student Interaction: MBR

(This protocol is slightly edited)

Solve (f = Finished, q = Quit early)

7x = 2 * 2
S: 7x = 4
S: f

Simplify further 7x = 4
S: x = 4
S: f

Thank you,
but you didn't get that one right.
We will now look at that task again.

Here we are considering the task 7x = 2 * 2

Did you work out the task something like this?:

7x = 2 * 2
7x = 4
7x/7 = 4
x = 4

Please type YES or NO: y (Student replies)

It seems that you changed 7x = 4 to something like 7x/7 = 4
This is WRONG.
Do NOT only divide ONE side by the coefficient.
You should divide both sides of the equation by the coefficient to change
7x = 4 to 7x/7 = 4/7

------------------------------------------------------------

Now let me show you how you SHOULD have worked the whole task

7x = 2 * 2
                multiply 2 numbers
7x = 4
                divide both sides of the equation by the coefficient
7x/7 = 4/7
                divide
x = 4/7
                Finished


3.  THE  SERIES  OF  EXPERIMENTS  ON  THE  RELATIVE  EFFECTIVENESS 
OF  THE  SEVERAL  REMEDIAL  TREATMENTS 

As noted in the introduction, the relative effectiveness of different
forms of remediation was the central issue in this research. The
intention was to build a highly adaptive, intelligent tutoring system. As a
first step in this process, we attempted to verify the hypothesis that
MBR (Model-based-remediation) was superior to Reteaching. Subsequent
experimentation was to establish the optimum conditions for students
with differing aptitudes.

Essentially, we could find no evidence supporting the greater
effectiveness of MBR for algebra when taught procedurally (or more
specifically, not for our target population). The rest of this section
discusses in some detail the main points of the experiments conducted to
investigate this issue. See Martinak, Sleeman, Kelly, Moore & Ward (1987)
for a more detailed description of this series of experiments.

After a series of pilot studies to verify that students were able to
easily use the RPIXIE system, we ran our first formal experiment. This
and the subsequent studies followed a pretest-intervention-posttest
design. For a class of 24 13-14 year old pupils who were below average
in mathematics, it was found that MBR and Reteaching by RPIXIE were both
more effective than merely telling the student whether the task had been
worked correctly. However, MBR was not better than reteaching; the
performance of these groups was comparable. This was a surprising result.

This result led us to believe that the issues of remediation were much
more subtle than initially suspected, and therefore we decided to
replicate the study using human tutors. This second study gave essentially
the same result. It was then hypothesised that these results may have
occurred because the treatments had not involved the students
sufficiently in the remediation, or alternatively that PIXIE's corrective
comments, targeted at those part(s) of the task the student had worked
incorrectly, were failing to create the expected cognitive dissonance.
A third experiment was therefore conducted with 4 treatment groups,
namely MBR, MBR + Cognitive Engagement (here the student was asked to
reteach to the tutor the correct procedures), MBR + Cognitive Dissonance
(here the student was required to substitute his (incorrect) solution
back into the original equation, thereby demonstrating that his solution
was wrong) and Reteaching. Again the results for all 4 groups were
comparable.
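
The substitution step used in the MBR + Cognitive Dissonance condition
can be sketched as follows. This is only an illustration of the check
itself, assuming tasks of the form a*x = b; the function name is
hypothetical and nothing here reproduces the tutoring dialogue used in
the experiment.

```python
from fractions import Fraction


def check_by_substitution(a: int, b: int, x: Fraction) -> tuple[Fraction, bool]:
    """Substitute x into a*x = b; return the left-hand side and whether it equals b."""
    left_hand_side = a * x
    return left_hand_side, left_hand_side == b


# Task from figure 1: 7x = 2 * 2, i.e. 7x = 4.
print(check_by_substitution(7, 4, Fraction(4)))     # the student's answer x = 4
print(check_by_substitution(7, 4, Fraction(4, 7)))  # the correct answer x = 4/7
```

For the student's answer the left-hand side evaluates to 28 rather than
4, which is the mismatch the condition was intended to make visible.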

This additional puzzling result led to a further range of hypotheses;
specifically, that many errors are in fact unstable, that is,
the same student given a comparable task on different occasions would
work the task differently. Indeed, a retrospective analysis of the last
experiment showed that only 18-26% of errors made on the pre-test were
present on the same items one week later during the tutorial. (Please
note that this is a very stringent requirement for stability of errors;
a more lenient criterion is introduced later.)

The fourth experiment in the series was explicitly designed to
investigate the issue of stability. A test measure containing 51 items was
developed - 17 sets of 3 comparable items. This measure was given twice
at a week's interval. The intent of this study was to identify errors
that were stable over time, and then to provide human tutoring on those.
On this occasion, for an error to be classified as "stable", it had to
occur at least twice on both pretests. Students with stable errors were
assigned randomly to one of three conditions, namely MBR, Reteach or the
control group. Both the MBR and Reteach groups were tutored individually
for a 50 minute period; the control group took only the 2 pretests
and the posttest. Below we give the average number of occurrences of the
19 most common errors for the 3 groups (these 19 errors account for 80%
of the errors in this study):


              Pretest 1   Pretest 2   Posttest   Number of
                                                 students in group

MBR             19.2        18.5        10.3         9
Reteaching      29.4        22.6         9.9         8
Control         32.0        26.4        26.0         8

These figures suggest that errors are fairly stable from Pretest-1 to
Pretest-2; however, errors decrease substantially from the pretest to
tutoring, presumably due to the effects of tutoring. An error, once
tutored, tends not to reappear in the same tutorial session;
additionally, tutoring appears to suppress attentional errors*. These
results also show that there are significantly fewer errors on the
posttest for the treatment groups when compared with the control group;
again both treatment groups were highly comparable. A further analysis of
the data given in the above table shows that the percentage decrease in
the number of stable errors between the first pretest and the posttest for
MBR, Reteaching and the control group was respectively 46%, 66% and 19%.
This suggests that although some errors are unstable, tutoring is
effective at remediating stable errors, but again MBR is not more
effective than reteaching. (These observations are consistent with Sleeman
(1983) who reported an experiment in which the MBR group greatly
outperformed the control group.)
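
The stability criterion and the percentage decreases quoted above can be
checked with a short sketch. The function names are hypothetical, and the
reading of "at least twice on both pretests" as "at least twice on each
pretest" is an assumption; the averages are copied from the table.

```python
def is_stable(count_pretest_1: int, count_pretest_2: int, minimum: int = 2) -> bool:
    """Assumed reading of the criterion: the error occurs at least
    `minimum` times on each of the two pretests."""
    return count_pretest_1 >= minimum and count_pretest_2 >= minimum


def percent_decrease(before: float, after: float) -> float:
    """Percentage drop in the average error count from `before` to `after`."""
    return 100.0 * (before - after) / before


# Average error counts (Pretest 1, Posttest) copied from the table.
AVERAGES = {
    "MBR": (19.2, 10.3),
    "Reteaching": (29.4, 9.9),
    "Control": (32.0, 26.0),
}

for group, (pretest_1, posttest) in AVERAGES.items():
    print(f"{group}: {percent_decrease(pretest_1, posttest):.0f}% decrease")
```

Rounded to whole percentages, this reproduces the 46%, 66% and 19%
reported for MBR, Reteaching and the control group.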

Several additional experiments were run with RPIXIE which generally
supported the result that MBR and Reteaching were very comparable; see
Martinak, et al. (1987) for details.

* Errors caused by lack of attention to the task.


These  results  will  now  be  interpreted  within  the  framework  of  the 
assumptions  listed  in  the  introductory  section.  Explicitly,  the  results 
from  our  experiment  will  be  related  to  each  of  these  assumptions. 


Assumption 1. (Diagnosing a student's error is much more complex than
remediation.) Even if a diagnosis has been made correctly, remediation
involves conveying that information to the student in a way that is
intelligible. Much of our social know-how is about communication, e.g.,
phrasing a request so that it will appear attractive to the hearer, etc.
Remediation is no less subtle; the teachers in our study (section 2)
seemed to understand that. (Unfortunately, RPIXIE did not!)


Conclusion:  Those  of  us  who  have  been  enamoured  with  the  technicalities 
of  inferring  student  models,  had  overlooked  the  complexities  inherent  in 
subsequently  communicating  the  remediation.  (Note:  this  is  not  to  say 
that  diagnosis  is  a  simple  matter). 


Assumption 2. (Highlighting a student's specific error(s) would create
cognitive dissonance.) This set of experiments clearly established that,
for this topic and teaching approach, reteaching and
model-based-remediation were better than no treatment at all, but that
reteaching and model-based-remediation were highly comparable. This
initially surprising result indicates that, for this topic and these
students, CAI would have been just as effective as ICAI (as, of course,
CAI programs are quite capable of storing pre-worked solutions to tasks).
Secondly, one interpretation of the fact that students did equally well on
Reteaching as on MBR is that the students in the Reteaching group were
self-correcting. That is, they compared their incorrect working with the
correct form, and generally inferred their own errors. Again this
interpretation is consistent with other experiments on "passive" versus
"active" instruction, and is consistent with the literature on
meta-cognition (Brown, 1978).


This interpretation would explain why immediate feedback is so important
for learning (Lewis & Anderson, 1985). (If the critical component is the
provision of virtually instant feedback then this would also explain why
the feedback provided by teachers on exercises a week or so after the
event is not very effective.)

Assumption 3. (Many student errors would be stable.)

These experiments have supplied further evidence for the series of
error-types suggested by Sleeman (1983). That is, one should expect to
find students with a range of types of errors, including:

- strongly held consistent mal-rules.

- related "families" of mal-rules which are applied "randomly".

- passing attentional errors (like adding/omitting signs).

- guesses because the tutor or the program demands an answer.*

- mental-slips and casual (typing) errors.

When the investigators reviewed their tapes with this classification in
mind they found strong supporting evidence for it, and reported that it
was clear that students had varying confidence concerning the
correctness of the different types of errors. This analysis has
considerable implications for remediation. Clearly, one might wish to
highlight and discuss in detail a known stable error, but a detailed
discussion of a pure guess might be counter-productive as it might help
"cement" the incorrect form. How to phrase remedial comments, as we have
seen, is also of vital importance. The version of RPIXIE used in these
experiments lacks the sophistication of being able to make a "global"
diagnosis of a student's error pattern. However, the analysis of these
experiments suggests that this may be an important issue. Section 4
discusses a pilot system which produces more global diagnoses, i.e.,
diagnoses which "explain" a series of errors - possibly which occurred
in various task-sets.

* After the 1981 experiment, a facility was added to
PIXIE to allow students to QUIT any task, so as to
avoid this situation.


Further, the above analysis led to the suggestion that because the
students had been taught procedurally they might not have acquired an
(overall) mental model for the domain. We further hypothesized that had
they been taught conceptually, then there would have been a greater
chance of the student forming a mental model, and thus such students
should exhibit more stable errors. We were unable to find any Aberdeen
secondary schools that taught algebra conceptually, so this hypothesis
remains untested.


The  implications  of  the  series  of  diagnostic/remedial  experiments  are 
discussed  in  some  detail  in  sections  6  &  7. 

A sub-system has been implemented which produces a more global
analysis of a student's performance on a wide range of tasks.
Previously, the most commonly used mode of the RPIXIE system produced
a diagnosis (and if needed remediation) which was specific to a
particular task. This was too myopic a view. The current sub-system,
when it is shown a student analysis record of the following form:

5x=15 => x=3 and 5x=7 => x=2

suggests that it is probable the student can correctly solve tasks of
the form ax=b when b is divisible by a, but not when b is not divisible
by a. This sub-system also suggests sets of tasks that should be used in
tutoring such a student.




Similarly, given the following student performance:

5x+3=11 => x=8/5 and 5x+3x=11 => x+x=11-3-5

this sub-system would suggest that the student can successfully solve
tasks of the form ax+b=c, but not those of the form ax+bx=c, suggesting
that the student does not know how to combine x-terms.
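
The kind of cross-task summary described above can be sketched as
follows. This is not the implemented sub-system: the single task feature
(whether b is divisible by a in a*x = b) is hard-coded, and the function
name and record format are assumptions made for illustration.

```python
from fractions import Fraction


def global_diagnosis(records):
    """records: iterable of (a, b, student_answer) for tasks a*x = b.

    Returns, for each task class, a pair (number correct, number attempted).
    """
    tally = {True: [0, 0], False: [0, 0]}  # divisible? -> [correct, attempted]
    for a, b, answer in records:
        divisible = (b % a == 0)
        tally[divisible][1] += 1
        if Fraction(answer) == Fraction(b, a):
            tally[divisible][0] += 1
    return {
        "b divisible by a": tuple(tally[True]),
        "b not divisible by a": tuple(tally[False]),
    }


# The student record from the text: 5x=15 => x=3 and 5x=7 => x=2.
print(global_diagnosis([(5, 15, 3), (5, 7, 2)]))
```

With only one task per class the summary is merely suggestive, which is
consistent with the sub-system proposing further sets of tasks for
tutoring such a student.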

Various software aids have been produced for the developer of new
knowledge bases. These include a program which, given the template for a
level and the set of models, generates the set of most discriminating
tasks. (Ideally these tasks would be completely discriminatory.) Another
package checks for syntax errors and certain semantic inconsistencies in
knowledge bases (e.g., entities being referenced but not defined).
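
One way to think about generating discriminating tasks is sketched
below. This is an assumption-laden illustration, not the program
described above: models are plain functions, the template is fixed as
a*x = b, a task's score is the number of distinct answers the models
give, and the third model is a hypothetical mal-rule chosen to match the
5x=7 => x=2 example.

```python
from fractions import Fraction
from itertools import product


def correct(a, b):
    """Correct model for a*x = b."""
    return Fraction(b, a)


def flipped_division(a, b):
    """Mal-rule from section 2: the division is done the wrong way round."""
    return Fraction(a, b)


def subtract_coefficient(a, b):
    """Hypothetical mal-rule: the coefficient is subtracted instead of divided."""
    return Fraction(b - a)


MODELS = [correct, flipped_division, subtract_coefficient]


def most_discriminating_tasks(models, coefficients, constants):
    """Return the (a, b) tasks on which the models give the most distinct answers.

    Assumes non-empty ranges of non-zero coefficients and constants.
    """
    scored = []
    for a, b in product(coefficients, constants):
        distinct_answers = len({model(a, b) for model in models})
        scored.append((distinct_answers, (a, b)))
    best = max(score for score, _ in scored)
    return [task for score, task in scored if score == best]


print(most_discriminating_tasks(MODELS, [2, 5], [5, 7]))
```

A task such as 5x=5 is excluded here because the correct model and the
flipped-division model both give x=1, so an answer of 1 would not tell
the two apart.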

Although not sponsored by this project, we have implemented during this
period a system, INFER*, which is able to infer previously unknown
mal-rules from protocols, given additional background knowledge and some
focusing heuristics. Additionally, we have implemented a system, MALGEN,
which applies perturbations to correct rules, and filters out "variants"
which violate certain meta-constraints. For details of these approaches
see Sleeman (1982) and Sleeman, Hirsh & Kim (1987).

The critical issue of field-testing these new sub-systems, and the
subsequent integration of these several components into a further enhanced
PIXIE-system, is discussed in section 7.


5. AIDS FOR HELPING TEACHERS BE BETTER DIAGNOSTICIANS


The TPIXIE program drew some of its inspiration from the BUGGY program
(Brown & Burton, 1978), which presents trainee teachers with incorrectly
worked subtraction tasks and then asks them to suggest additional tasks
and indicate how that same student, if consistent, would work them. The
major difference between BUGGY and TPIXIE is the domain of application.

A pilot study with the system in California showed that trainee-teachers
who used TPIXIE were somewhat better at diagnosing errors than those in
the control group who merely worked algebra tasks. However, the
trainee-teachers suggested that the example-set be changed so that more
difficult tasks would be encountered earlier in the session. Also the
analysis of the data showed that the transfer of knowledge to new but
highly analogous tasks was not very substantial (Schneider, Kelly, Blando,
Martinak, Sleeman & Snow, 1986).

A further experiment with an enhanced TPIXIE system was conducted in
Aberdeen with a larger sample of trainee-teachers; for details of the
system and the study see Kelly, Sleeman, Ward & Martinak (1987). The
encouraging trend of the pilot study was confirmed. The subjects on
TPIXIE were significantly better at diagnosing algebra errors on the
posttest than those in the control group. The study also recommended
further refinements to the methodology and test instrument prior to
replication.

If,  as  section  3  suggests,  Reteaching  is  as  effective  as  MBR,  then  there 
is  less  point  in  training  teachers  to  be  good  diagnosticians  than  we  had 
previously  thought.  Nevertheless,  one  could  make  the  case,  that  being 
aware  of  possible  student  errors  would  make  them  better  classroom  teach- 

6. CONCLUSIONS


Listed below are the conclusions drawn from a series of PIXIE-related
studies:


- Virtually all teachers encountered in this study in American,
English, Irish and Scottish schools taught algebra procedurally.
(Section 2)


-  Model-based-remediation  and  reteaching  using  humans  as  tutors  are 
both  more  effective  than  no  tutoring.  (Section  3) 


- Model-based-remediation and simply reteaching are equally effective
when the tutoring is carried out by humans. This leads to the
hypothesis discussed in section 3 that the students in the reteaching
group were self-correcting, and the conclusion that, for some domains
and some student populations, CAI would be as effective as ICAI.
(Section 3)




- In the last study, a significant number of students had stable
errors which accounted for approximately 80% of errors recorded. There
appeared to be a bigger percentage of unstable errors when students
interacted with the computer, namely with RPIXIE. (Section 3)

- There is further evidence that students make a wide variety of types
of errors (from "hard" bugs to careless (typing) errors) and that
students hold beliefs of varying strengths about these error types.
(Section 3; see paragraph on Assumption 3)

- The PIXIE system has been further enhanced, so that it should be
more human-like in its tutoring - having the ability to tutor in several
domains and to form "global" diagnoses. (These facilities now need to
be thoroughly field-tested.) (Section 4)

- It is possible to train teachers to diagnose error patterns in
examples wrongly worked by students. (Section 5)

7. FURTHER WORK SUGGESTED BY THIS STUDY


Diagnosis and Remediation


- An extensive set of field-trials is required to determine under what
conditions Reteaching is as effective as model-based-remediation.
Educators, as well as those in the ITS field, need to know how this is
influenced by subject domain, age of student and teaching approach.
(Probe whether conceptual teaching leads to more stable mental models.)
-  Run  a  study  to  further  investigate  the  effectiveness  of  an  MBR  + 
Cognitive  Dissonance  condition,  modelled  after  Swan  (1983). 

- Replicate the study to investigate the stability of errors in
algebra with a larger N and with the requirement that each stable error
should be represented in all conditions.

- Run a study to compare rates of attentional errors with human and
computer tutoring (both MBR and Reteaching).

- Investigate how the stability of errors and models might be
influenced by subject-domain, student age, level of attainment and teaching
approach. As a secondary issue, one would wish to investigate the extent
to which students have a distinguishable conceptual model and whether
the range of error-types found in algebra is present in other domains.

- Run an experiment in which the student is distracted immediately
after he has done a task, and before he is shown the Reteaching. It was
suggested above that one reason why reteaching was as effective as
model-based-remediation might be because the student was essentially
self-correcting. (If this hypothesis is correct the "distracted"
students would do considerably worse than those who are not.)
Alternatively, run a study in which there is a differential time gap
between working the task and receiving feedback.


- Get tutors to review the extensive working from a student and
articulate a "global" diagnosis; have the tutor remediate a student on the
basis of this analysis. Compare the effectiveness of this remediation
with Reteaching.


- Compare (human) empathetic tutoring with "neutral" tutoring,
ensuring that the instructional context of the material tutored is
identical. [This experiment would need to be run for a variety of
personality types as well as for the factors noted earlier.]


-  Compare  the  effect  of  a  human  tutor  giving  detailed  causal-based- 
remediation  (see  definition  in  section  2)  against  "straight"  reteaching. 


System 


- Run extensive field trials to determine the effectiveness of the
multi-domain diagnosis/remedial system, and of the system which can form
global diagnoses. [If this is successful, then a system should be
implemented which integrates the higher-order diagnoses, multiple
knowledge bases, as well as the INFER* algorithm (which is able to infer
previously unknown mal-rules from protocols).]


TPIXIE  (Studies  to  see  if  teachers  can  be  taught  to  diagnose) 


- Repeat the current TPIXIE study with a refined instrument, and
investigate again whether transfer is effective.